Abstract

We provide in this paper a fully adaptive penalized procedure to select a covariance among a collection of models, observing i.i.d. replications of the process at fixed observation points. For this we generalize the results of [3] and propose to use a data driven penalty to obtain an oracle inequality for the estimator. We prove that this method is an extension to the matricial regression model of the work by Baraud in [1].

Keywords: covariance estimation, model selection, adaptive procedure.

1 Introduction

Estimating the covariance function of stochastic processes is a fundamental issue in statistics with many applications, ranging from geostatistics and financial series to epidemiology (we refer to [10], [8] or [5] for general references). While parametric methods have been extensively studied in the statistical literature (see [5] for a review), nonparametric procedures have only recently received attention, see for instance |S1 El IH Ej and references therein.

In [3], a model selection procedure is proposed to construct a nonparametric estimator of the covariance function of a stochastic process under mild assumptions. However, their method heavily relies on prior knowledge of the variance. In this paper, we extend this procedure and propose a fully data driven penalty which leads to selecting the best covariance among a collection of models. This result constitutes a generalization to the matricial regression model of the selection methodology provided in [1].

Consider a stochastic process $(X(t))_{t\in T}$ taking its values in $\mathbb{R}$ and indexed by $T \subset \mathbb{R}^d$, $d \in \mathbb{N}$. We assume that $E[X(t)] = 0$ for all $t \in T$ and we aim at estimating its covariance function $\sigma(s,t) = E[X(s)X(t)] < \infty$ for all $t, s \in T$. We assume we observe $X_i(t_j)$ where $i \in \{1,\dots,n\}$ and $j \in \{1,\dots,p\}$. Note that the observation points $t_j$ are fixed and that the $X_i$'s are independent copies of the process $X$. Set $x_i = (X_i(t_1),\dots,X_i(t_p))^\top$ for all $i \in \{1,\dots,n\}$ and denote by $\Sigma$ the covariance matrix of $X$ at the observation points,
$$\Sigma = E\left(x_i x_i^\top\right) = \left[\sigma(t_j,t_k)\right]_{1\le j\le p,\ 1\le k\le p}.$$

Following the methodology presented in [3], we approximate the process $X$ by its projection onto some finite dimensional model. For this, consider a countable set of functions $(g_\lambda)_{\lambda\in\Lambda}$, which may be for instance a basis of $L^2(T)$, and choose a collection of models $\mathcal{M} \subset \mathcal{P}(\Lambda)$. For $m \in \mathcal{M}$, a finite number of indices, the process can be approximated by
$$X(t) \approx \sum_{\lambda\in m} a_\lambda g_\lambda(t).$$
Such an approximation leads to an estimator which depends on the collection of functions $m$, denoted by $\hat\Sigma_m$. Our objective is to select, in a data driven way, the best model, i.e. the one close to an oracle $m_0$ defined as the minimizer of the quadratic risk, namely
$$m_0 \in \arg\min_{m\in\mathcal{M}} R(m) = \arg\min_{m\in\mathcal{M}} E\left\|\Sigma - \hat\Sigma_m\right\|^2.$$
This result is achieved using a model selection procedure.

The paper is organized as follows. The statistical framework of the matrix regression is described in Section 2. Section 3 is devoted to the main statistical results: we recall the results on the estimator given in [3] and prove an oracle inequality with a fully data driven penalty. Section 4 states technical results which are used throughout the paper, while the proofs are postponed to the Appendix.

2 Statistical model and notations 

We consider an $\mathbb{R}$-valued process $X(t)$ indexed by $T$, a subset of $\mathbb{R}^d$, with expectation equal to 0. We are interested in its covariance function, denoted by $\sigma(s,t) = E[X(s)X(t)]$.

We have at hand the observations $x_i = (X_i(t_1),\dots,X_i(t_p))^\top$ for $1 \le i \le n$, where the $X_i$ are independent copies of the process and the $t_j$ are deterministic points. We denote by $\Sigma \in \mathbb{R}^{p\times p}$ the covariance matrix of the vector $x_i$.

Hence we observe
$$x_i x_i^\top = \Sigma + U_i, \qquad 1 \le i \le n, \tag{1}$$
where the $U_i$ are i.i.d. error matrices with expectation 0. We denote by $S$ the empirical covariance of the sample: $S = \frac{1}{n}\sum_{i=1}^n x_i x_i^\top$.

We use the Frobenius norm $\|\cdot\|$ defined by $\|A\|^2 = \mathrm{Tr}(AA^\top)$ for every matrix $A$. Recall that for a given matrix $A \in \mathbb{R}^{p\times q}$, $\mathrm{vec}(A)$ is the vector in $\mathbb{R}^{pq}$ obtained by stacking the columns of $A$ on top of one another. We denote by $A^-$ the reflexive generalized inverse of the matrix $A$, see for instance [9] or [7].
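As a small illustration of the two conventions just recalled, the following sketch (Python with NumPy, chosen here only for illustration; the helper names `vec` and `frobenius_sq` are hypothetical) stacks the columns of a matrix and checks that the squared Frobenius norm $\mathrm{Tr}(AA^\top)$ coincides with the squared Euclidean norm of $\mathrm{vec}(A)$.

```python
import numpy as np

def vec(a):
    """Stack the columns of a matrix on top of one another (column-major order)."""
    return np.asarray(a).flatten(order="F")

def frobenius_sq(a):
    """Squared Frobenius norm, computed as Tr(A A^T)."""
    a = np.asarray(a, dtype=float)
    return float(np.trace(a @ a.T))

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
v = vec(A)                 # columns stacked: 1, 4, 2, 5, 3, 6
norm_sq = frobenius_sq(A)  # sum of squared entries
```

This identity is what allows the matrix model to be rewritten later as a vector regression model without changing the loss.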

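The idea is to consider that we have a quite good approximation of the process in the following form
$$X(t) \approx \sum_{\lambda\in m} a_\lambda g_\lambda(t), \tag{2}$$
where $m$ is a finite subset of a countable set $\Lambda$, $(a_\lambda)_{\lambda\in\Lambda}$ are random coefficients in $\mathbb{R}$ and $(g_\lambda)_{\lambda\in\Lambda}$ are real valued functions. We will consider models $m$ among a finite collection denoted by $\mathcal{M}$.

We denote by $G_m \in \mathbb{R}^{p\times|m|}$ the matrix with entries $(G_m)_{j\lambda} = g_\lambda(t_j)$ and by $a_m$ the random vector of $\mathbb{R}^{|m|}$ with coefficients $(a_\lambda)_{\lambda\in m}$.

Hence, we obtain the following approximations:
$$x = (X(t_1),\dots,X(t_p))^\top \approx G_m a_m, \qquad xx^\top \approx G_m a_m a_m^\top G_m^\top.$$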



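Thus, this point of view leads us to approximate $\Sigma$ by a matrix in the subset
$$\mathcal{S}(G_m) = \left\{G_m \Psi G_m^\top :\ \Psi \text{ symmetric in } \mathbb{R}^{|m|\times|m|}\right\} \subset \mathbb{R}^{p\times p}. \tag{3}$$
Hence, for a model $m$, a natural estimator for $\Sigma$ is given by the projection of $S$ onto $\mathcal{S}(G_m)$. We can prove using standard algebra (see [3] for a general proof) that it has the following form:
$$\hat\Sigma_m = \Pi_m S \Pi_m, \qquad m \in \mathcal{M}, \tag{4}$$
where
$$\Pi_m = G_m\left(G_m^\top G_m\right)^- G_m^\top \in \mathbb{R}^{p\times p} \tag{5}$$
are orthogonal projection matrices. Set
$$D_m = \mathrm{Tr}\left(\Pi_m \otimes \Pi_m\right),$$
which is the dimension of $\mathcal{S}(G_m)$, assumed to be positive, and $\Sigma_m = \Pi_m \Sigma \Pi_m$, the projection of $\Sigma$ onto this subspace.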
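The projection estimator described above can be sketched numerically. The following is a minimal illustration, not the authors' code: the function names, the polynomial design and the synthetic sample are assumptions made for the example, and the Moore-Penrose pseudo-inverse is used as one particular reflexive generalized inverse.

```python
import numpy as np

def projection_matrix(G):
    """Orthogonal projector onto the column space of G, G (G^T G)^- G^T."""
    # np.linalg.pinv returns the Moore-Penrose inverse, which is a
    # reflexive generalized inverse.
    return G @ np.linalg.pinv(G.T @ G) @ G.T

def projected_covariance(X, G):
    """Return (Pi S Pi, D_m) for a centred sample X (n x p) and design G (p x |m|)."""
    n = X.shape[0]
    S = X.T @ X / n                 # empirical covariance (1/n) sum x_i x_i^T
    Pi = projection_matrix(G)
    sigma_hat = Pi @ S @ Pi
    D_m = np.trace(Pi) ** 2         # Tr(Pi kron Pi) equals Tr(Pi)^2
    return sigma_hat, D_m

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 6)                        # hypothetical observation points
G = np.column_stack([np.ones_like(t), t, t ** 2])   # hypothetical model: 1, t, t^2
X = rng.standard_normal((50, t.size))               # synthetic centred sample
sigma_hat, D_m = projected_covariance(X, G)
```

Using $\mathrm{Tr}(\Pi_m)^2$ avoids building the $p^2 \times p^2$ Kronecker product just to read off $D_m$.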

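Hence we obtain the model selection procedure defined in [3]. The estimation error for a model $m \in \mathcal{M}$ is given by
$$E\left\|\Sigma - \hat\Sigma_m\right\|^2 = \left\|\Sigma - \Pi_m\Sigma\Pi_m\right\|^2 + \frac{\delta_m^2 D_m}{n}, \tag{6}$$
where
$$\delta_m^2 = \frac{\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)}{D_m}, \qquad \Phi = V\left(\mathrm{vec}\left(x_1x_1^\top\right)\right).$$
Given $\theta > 0$, it is thus natural to define the penalized covariance estimator $\tilde\Sigma = \hat\Sigma_{\tilde m}$ by
$$\tilde m = \arg\min_{m\in\mathcal{M}}\left\{\frac{1}{n}\sum_{i=1}^n\left\|x_ix_i^\top - \hat\Sigma_m\right\|^2 + \mathrm{pen}(m)\right\},$$
where
$$\mathrm{pen}(m) = (1+\theta)\frac{\delta_m^2 D_m}{n}. \tag{7}$$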



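The following result, proved in [3], states an oracle inequality for the estimator $\tilde\Sigma$.

Theorem 2.1. Let $q > 0$ be given such that there exists $\kappa > 2(1+q)$ satisfying $E\left\|x_1x_1^\top\right\|^\kappa < \infty$. Then, for some constants $K(\theta) > 1$ and $C'(\theta,\kappa,q) > 0$ we have that
$$\left(E\left\|\Sigma-\tilde\Sigma\right\|^{2q}\right)^{1/q} \le 2^{(q^{-1}-1)_+}\left[K(\theta)\inf_{m\in\mathcal{M}}\left(\left\|\Sigma-\Pi_m\Sigma\Pi_m\right\|^2 + \frac{\delta_m^2D_m}{n}\right) + \frac{\Delta_\kappa}{n}\,\delta_{\sup}^2\right],$$
where
$$\Delta_\kappa^q = C'(\theta,\kappa,q)\,E\left\|x_1x_1^\top\right\|^\kappa\left(\sum_{m\in\mathcal{M}}\delta_m^{-\kappa}D_m^{-(\kappa/2-1-q)}\right)$$
and
$$\delta_{\sup}^2 = \max\left\{\delta_m^2 : m\in\mathcal{M}\right\}.$$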

However, the penalty defined here depends on the quantity $\delta_m$, which is unknown in practice since it relies on the matrix $\Phi = V\left(\mathrm{vec}\left(xx^\top\right)\right)$. Our objective is to study a covariance estimator built with a new penalty involving an estimator of $\Phi$.

More precisely, we will replace $\mathrm{pen}(m)$ by an empirical version $\widehat{\mathrm{pen}}(m)$, where
$$\widehat{\mathrm{pen}}(m) = (1+\theta)\frac{\hat\delta_m^2 D_m}{n}, \tag{8}$$
and
$$\hat\delta_m^2 = \frac{\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\hat\Phi\right)}{D_m},$$
with $\hat\Phi$ an estimator of $\Phi$.

The objective is to generalize Theorem 2.1 and to construct a fully adaptive penalized procedure to estimate the covariance function.

3 Main result: adaptive penalized covariance estimation

Here we state the oracle inequality obtained for the new covariance estimator introduced 
previously. 

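Set
$$y_i = \mathrm{vec}\left(x_ix_i^\top\right), \qquad 1 \le i \le n,$$
which are vectors in $\mathbb{R}^{p^2}$, and denote by $S_{vec} = \frac{1}{n}\sum_{i=1}^n y_i$ their empirical mean. Consider the constant $C_{\inf} = \inf_{m\in\mathcal{M}}\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)$, and assume that the collection of models is chosen such that $C_{\inf} > 0$. Set
$$\hat\Phi = \frac{1}{n}\sum_{i=1}^n y_iy_i^\top - S_{vec}S_{vec}^\top, \qquad \hat\delta_m^2 = \frac{\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\hat\Phi\right)}{D_m}.$$
Given $\theta > 0$, we consider the covariance estimator $\hat\Sigma = \hat\Sigma_{\hat m}$ with
$$\hat m = \arg\min_{m\in\mathcal{M}}\left\{\frac{1}{n}\sum_{i=1}^n\left\|x_ix_i^\top - \hat\Sigma_m\right\|^2 + \widehat{\mathrm{pen}}(m)\right\},$$
where
$$\widehat{\mathrm{pen}}(m) = (1+\theta)\frac{\hat\delta_m^2 D_m}{n}. \tag{9}$$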
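The data driven selection rule can be sketched as follows. This is only an illustrative implementation under stated assumptions: the function name `select_model`, the dictionary of candidate designs and the synthetic data are hypothetical, the sample is assumed centred, and the Kronecker product is formed explicitly, which is only reasonable for small $p$.

```python
import numpy as np

def select_model(X, designs, theta=1.0):
    """Minimize the empirical contrast plus the estimated penalty over the models.

    X       : (n, p) array whose rows are the centred observations x_i
    designs : dict mapping a model label to its (p, |m|) matrix G_m
    theta   : penalty constant, theta > 0
    Returns (label, criterion value, selected covariance estimate).
    """
    n, p = X.shape
    outer = np.einsum("ij,ik->ijk", X, X)             # x_i x_i^T, shape (n, p, p)
    S = outer.mean(axis=0)                            # empirical covariance
    Y = np.stack([o.flatten(order="F") for o in outer])  # rows y_i = vec(x_i x_i^T)
    s_vec = Y.mean(axis=0)
    phi_hat = Y.T @ Y / n - np.outer(s_vec, s_vec)    # estimator of Phi
    best = None
    for label, G in designs.items():
        Pi = G @ np.linalg.pinv(G.T @ G) @ G.T        # projector Pi_m
        sigma_m = Pi @ S @ Pi
        D_m = np.trace(Pi) ** 2
        delta2_hat = np.trace(np.kron(Pi, Pi) @ phi_hat) / D_m
        contrast = np.mean(np.sum((outer - sigma_m) ** 2, axis=(1, 2)))
        crit = contrast + (1.0 + theta) * delta2_hat * D_m / n
        if best is None or crit < best[1]:
            best = (label, crit, sigma_m)
    return best

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 5)                          # hypothetical design points
designs = {k: np.vander(t, k, increasing=True) for k in (1, 2, 3)}
X = rng.standard_normal((200, 2)) @ np.vander(t, 2, increasing=True).T
label, crit, sigma_hat = select_model(X, designs)
```

Nothing in the rule requires knowledge of $\Phi$: every quantity entering the criterion is computed from the sample.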



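Theorem 3.1. Let $1 \ge q > 0$ be given such that there exists $\beta > \max\left(2(1+2q),\, 3+2q\right)$ satisfying $E\left\|xx^\top\right\|^\beta < \infty$.
Then, for a constant $C$ depending on $\theta$, $\beta$ and $q$, we have for $n \ge n(\beta,\theta,C_{\inf},\Sigma)$, and for all $\kappa \in \left]2(1+2q);\ \min(\beta, 2\beta-4)\right[$:
$$\left(E\left\|\Sigma-\hat\Sigma\right\|^{2q}\right)^{1/q} \le C\left[\inf_{m\in\mathcal{M}}\left(\left\|\Sigma-\Sigma_m\right\|^2 + \frac{\delta_m^2D_m}{n}\right) + \frac{1}{n}\left(\Delta_\beta\left(E\left\|xx^\top\right\|^\kappa\right)^{2/\kappa} + \delta_{\sup}^2\,\Delta_\kappa\right)\right],$$
where
$$\Delta_\beta = c(\theta,\beta,q)\left(E\left\|xx^\top\right\|^\beta\sum_{m\in\mathcal{M}}\delta_m^{-\beta}D_m^{-\beta/2}\right)^{1/q-2/\kappa}, \tag{10}$$
$$\Delta_\kappa^q = c(\theta,\kappa,q)\,E\left\|xx^\top\right\|^\kappa\left(\sum_{m\in\mathcal{M}}\delta_m^{-\kappa}D_m^{-(\kappa/2-1-q)}\right), \tag{11}$$
and
$$\delta_{\sup}^2 = \max\left\{\delta_m^2 : m\in\mathcal{M}\right\}.$$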



We have obtained in Theorem 3.1 an oracle inequality, since the estimator $\hat\Sigma$ has the same quadratic risk as the "oracle" estimator except for an additive term of order $O\left(\frac{1}{n}\right)$ and a constant factor. Hence, the selection procedure is optimal in the sense that it behaves as if the true model were at hand.

The proof of this theorem is divided into two parts. First, as in the proof of Theorem 2.1 given in [3], we will consider a vectorized version of the model (1). In this technical part we will obtain an oracle inequality under some particular assumptions for a general penalty. In a second part, we will prove that our particular penalty satisfies these assumptions by using properties of the estimator $\hat\Phi$.



4 Technical results 



4.1 Vectorized model 

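Here we consider the vectorized version of model (1). In this case, we observe the following vectors in $\mathbb{R}^{p^2}$:
$$y_i = f_i + \varepsilon_i, \qquad 1 \le i \le n. \tag{12}$$
Here $y_i$ corresponds to $\mathrm{vec}\left(x_ix_i^\top\right)$ in the model (1), $f_i$ to $\mathrm{vec}(\Sigma)$ and $\varepsilon_i$ to $\mathrm{vec}(U_i)$. We set $f = \left(f_1^\top,\dots,f_n^\top\right)^\top$, $y = \left(y_1^\top,\dots,y_n^\top\right)^\top$ and $\varepsilon = \left(\varepsilon_1^\top,\dots,\varepsilon_n^\top\right)^\top$, which are vectors in $\mathbb{R}^{np^2}$.

We estimate $f$ by an estimator of the form
$$\hat f_m = P_m y, \qquad m \in \mathcal{M},$$
where $P_m$ is the orthogonal projection onto a subspace $S_m$ of dimension $D_m$. We write $f_m = P_m f$ and we consider the empirical norm $\|f\|_n^2 = \frac{1}{n}\sum_{i=1}^n f_i^\top f_i$ with the corresponding scalar product $\langle\cdot,\cdot\rangle_n$.

First we state the vectorized form of Theorem 2.1. Write
$$\delta_m^2 = \frac{\mathrm{Tr}\left(P_m\left(\mathrm{Id}_n\otimes\Phi\right)\right)}{D_m}, \qquad \delta_{\sup}^2 = \max\left\{\delta_m^2 : m\in\mathcal{M}\right\}.$$
Given $\theta > 0$, define the penalized estimator $\tilde f = \hat f_{\tilde m}$, where
$$\tilde m = \arg\min_{m\in\mathcal{M}}\left\{\left\|y-\hat f_m\right\|_n^2 + \mathrm{pen}(m)\right\},$$
with
$$\mathrm{pen}(m) = (1+\theta)\frac{\delta_m^2D_m}{n}.$$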



Then, the proof of Theorem 2.1 relies on the following proposition proved in [3]: 



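Proposition 4.1. Let $q > 0$ be given such that there exists $\kappa > 2(1+q)$ satisfying $E\|\varepsilon_1\|^\kappa < \infty$. Then, for some constants $K(\theta) > 1$ and $C(\theta,\kappa,q) > 0$ we have that
$$\left(E\left\|f-\tilde f\right\|_n^{2q}\right)^{1/q} \le 2^{(q^{-1}-1)_+}\left[K(\theta)\inf_{m\in\mathcal{M}}\left(\left\|f-P_mf\right\|_n^2 + \frac{\delta_m^2D_m}{n}\right) + \frac{\Delta_\kappa}{n}\,\delta_{\sup}^2\right], \tag{13}$$
where
$$\Delta_\kappa^q = C(\theta,\kappa,q)\,E\|\varepsilon_1\|^\kappa\left(\sum_{m\in\mathcal{M}}\delta_m^{-\kappa}D_m^{-(\kappa/2-1-q)}\right).$$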



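The new estimator $\hat\Sigma$ defined previously corresponds here to the estimator $\hat f = \hat f_{\hat m}$, where
$$\hat m = \arg\min_{m\in\mathcal{M}}\left\{\left\|y-\hat f_m\right\|_n^2 + \widehat{\mathrm{pen}}(m)\right\},$$
with
$$\widehat{\mathrm{pen}}(m) = (1+\theta)\frac{\hat\delta_m^2D_m}{n},$$
and $\hat\delta_m^2$ is some estimator of $\delta_m^2$.

The next proposition gives an oracle inequality for this estimator under new assumptions on the model. Like Proposition 4.1, it is inspired by the paper [1].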



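Proposition 4.2. Let $1 \ge q > 0$ be given such that there exists $\kappa > 2(1+2q)$ satisfying $E\|\varepsilon_1\|^\kappa < \infty$.
For $\alpha \in\, ]0;1[$, set $\Omega = \bigcap_{m\in\mathcal{M}}\left\{\hat\delta_m^2 \ge (1-\alpha)\delta_m^2\right\}$.
Assume that

A1. $E\left[\hat\delta_m^2\right] \le \delta_m^2$.

A2. $P(\Omega^c) \le \dfrac{C(\alpha)}{n^\gamma}$ for some $\gamma \ge \dfrac{q}{1-2q/\kappa}$.

Then, for a constant $C$ depending on $\kappa$, $\theta$ and $q$, we have
$$\left(E\left\|f-\hat f\right\|_n^{2q}\right)^{1/q} \le C\left[\inf_{m\in\mathcal{M}}\left(\left\|f-P_mf\right\|_n^2 + \frac{\delta_m^2D_m}{n}\right) + \frac{C(\alpha)^{1/q-2/\kappa}}{n}\left(\left(E\|\varepsilon_1\|^\kappa\right)^{2/\kappa} + \|f\|_n^2\right) + \frac{\Delta_\kappa}{n}\,\delta_{\sup}^2\right], \tag{14}$$
where
$$\Delta_\kappa^q = C(\theta,\kappa,q)\,E\|\varepsilon_1\|^\kappa\left(\sum_{m\in\mathcal{M}}\delta_m^{-\kappa}D_m^{-(\kappa/2-1-q)}\right) \tag{15}$$
and $\alpha = \alpha(\theta)$ is fixed in $]0;1[$.

Theorem 3.1 is thus a direct application of Proposition 4.2. Hence it only remains to check the two assumptions A1 and A2.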



4.2 Auxiliary concentration type lemmas 

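Here we state some propositions required in the proofs of the previous results.
To our knowledge, the first is due to von Bahr and Esseen in [11].

Lemma 4.3. Let $U_1,\dots,U_n$ be independent centred variables with values in $\mathbb{R}$. For any $1 \le \kappa \le 2$ we have:
$$E\left[\left|\sum_{i=1}^n U_i\right|^\kappa\right] \le 8\sum_{i=1}^n E\left[|U_i|^\kappa\right].$$

The next proposition is proved in [3].

Proposition 4.4. Given $N, k \in \mathbb{N}$, let $A \in \mathbb{R}^{Nk\times Nk}\setminus\{0\}$ be a non-negative definite and symmetric matrix and $\varepsilon_1,\dots,\varepsilon_N$ i.i.d. random vectors in $\mathbb{R}^k$ with $E(\varepsilon_1) = 0$ and $V(\varepsilon_1) = \Phi$. Write $\varepsilon = \left(\varepsilon_1^\top,\dots,\varepsilon_N^\top\right)^\top$, $\zeta(\varepsilon) = \sqrt{\varepsilon^\top A\varepsilon}$, and $\delta^2 = \frac{\mathrm{Tr}\left(A(\mathrm{Id}_N\otimes\Phi)\right)}{\mathrm{Tr}(A)}$. For all $\beta \ge 2$ such that $E\|\varepsilon_1\|^\beta < \infty$ it holds that, for all $x > 0$,
$$P\left(\zeta^2(\varepsilon) \ge \delta^2\mathrm{Tr}(A) + 2\delta^2\sqrt{\mathrm{Tr}(A)\rho(A)x} + \delta^2\rho(A)x\right) \le C_2(\beta)\,\frac{E\|\varepsilon_1\|^\beta\,\mathrm{Tr}(A)}{\delta^\beta\rho(A)\,x^{\beta/2}}, \tag{16}$$
where the constant $C_2(\beta)$ depends only on $\beta$.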



5 Appendix 



5.1 Proof of Proposition 4.2 



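This proof follows the guidelines of the proof of Theorem 6.1 in [1]. The following lemma will be helpful for the proof of this proposition.

Lemma 5.1. Choose $\eta = \eta(\theta) > 0$ and $\alpha = \alpha(\theta) \in\, ]0;1[$ such that $(1+\theta)(1-\alpha) \ge (1+2\eta)$. Set
$$H_n(f) = \left(\left\|f-\hat f\right\|_n^2 - \kappa(\theta)\left(\left\|f-f_{m_0}\right\|_n^2 + \frac{D_{m_0}}{n}\hat\delta_{m_0}^2\right)\right)_+,$$
where $\kappa(\theta) = \left(2+\frac{4}{\eta}\right)(1+\theta)$. Then, for $m_0$ minimizing $m \mapsto \|f-f_m\|_n^2 + \frac{D_m}{n}\delta_m^2$ in $m \in \mathcal{M}$,
$$E\left[H_n^q(f)\,1_\Omega\right] \le \Delta_\kappa^q\,\frac{\delta_{\sup}^{2q}}{n^q}, \tag{17}$$
where $\Delta_\kappa$ was defined in Proposition 4.2.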



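Proof of Lemma 5.1.

First, remark that on the set $\Omega$, for all $m \in \mathcal{M}$,
$$\widehat{\mathrm{pen}}(m) = (1+\theta)\frac{\hat\delta_m^2D_m}{n} \ge (1-\alpha)(1+\theta)\frac{\delta_m^2D_m}{n} \ge (1+2\eta)\frac{\delta_m^2D_m}{n},$$
which corresponds to the penalty of Proposition 4.1. Set $\mathrm{pen}(m) = (1+2\eta)\frac{\delta_m^2D_m}{n}$.

The proof of this lemma is based on the proof of Proposition 4.1 in [3]. In fact, it is sufficient to prove that for each $x > 0$ and $\kappa \ge 2$
$$P\left(H(f)\,1_\Omega \ge \left(1+\frac{2}{\eta}\right)\frac{x}{n}\,\delta_{\sup}^2\right) \le c(\kappa,\eta)\,E\|\varepsilon_1\|^\kappa\sum_{m\in\mathcal{M}}\frac{1}{\delta_m^\kappa}\,\frac{D_m\vee 1}{(\eta D_m+x)^{\kappa/2}}, \tag{18}$$
where we have set
$$H(f) = \left(\left\|f-\hat f\right\|_n^2 - \left(2+\frac{4}{\eta}\right)\left\{\left\|f-f_{m_0}\right\|_n^2 + \widehat{\mathrm{pen}}(m_0)\right\}\right)_+.$$
Indeed,
$$\left\|f-f_{m_0}\right\|_n^2 + \widehat{\mathrm{pen}}(m_0) = \left\|f-f_{m_0}\right\|_n^2 + (1+\theta)\frac{D_{m_0}}{n}\hat\delta_{m_0}^2 \le (1+\theta)\left(\left\|f-f_{m_0}\right\|_n^2 + \frac{D_{m_0}}{n}\hat\delta_{m_0}^2\right),$$
then we get that for all $q > 0$,
$$H^q(f)\,1_\Omega \ge H_n^q(f)\,1_\Omega.$$
Using the equality
$$E\left[H^q(f)\,1_\Omega\right] = \int_0^\infty q\,u^{q-1}\,P\left(H(f)\,1_\Omega > u\right)du \tag{19}$$
and following the proof of Proposition 4.1 in [3], we obtain the upper bound (17) of Lemma 5.1.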



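Now we turn to the proof of (18). For any $g \in \mathbb{R}^{np^2}$ we define the empirical quadratic loss function by
$$\gamma_n(g) = \|y-g\|_n^2.$$
Using the definition of $\gamma_n$ we have that for all $g \in \mathbb{R}^{np^2}$,
$$\|f-g\|_n^2 = \gamma_n(g) + 2\langle g-y,\varepsilon\rangle_n + \|\varepsilon\|_n^2,$$
and therefore
$$\left\|f-\hat f\right\|_n^2 - \left\|f-P_{m_0}f\right\|_n^2 = \gamma_n\left(\hat f\right) - \gamma_n\left(P_{m_0}f\right) + 2\left\langle\hat f-P_{m_0}f,\varepsilon\right\rangle_n. \tag{20}$$
Using the definition of $\hat f$, we know that
$$\gamma_n\left(\hat f\right) + \widehat{\mathrm{pen}}(\hat m) \le \gamma_n(g) + \widehat{\mathrm{pen}}(m_0)$$
for all $g \in S_{m_0}$. Then
$$\gamma_n\left(\hat f\right) - \gamma_n\left(P_{m_0}f\right) \le \widehat{\mathrm{pen}}(m_0) - \widehat{\mathrm{pen}}(\hat m). \tag{21}$$
So we get from (20) and (21) that
$$\left\|f-\hat f\right\|_n^2 \le \left\|f-P_{m_0}f\right\|_n^2 + \widehat{\mathrm{pen}}(m_0) - \widehat{\mathrm{pen}}(\hat m) + 2\left\langle f-P_{m_0}f,\varepsilon\right\rangle_n + 2\left\langle P_{\hat m}f-f,\varepsilon\right\rangle_n + 2\left\langle\hat f-P_{\hat m}f,\varepsilon\right\rangle_n. \tag{22}$$
In the following we set for each $m' \in \mathcal{M}$,
$$B_{m'} = \left\{g \in S_{m'} : \|g\|_n \le 1\right\}, \qquad G_{m'} = \sup_{g\in B_{m'}}\langle g,\varepsilon\rangle_n = \|P_{m'}\varepsilon\|_n,$$
$$u_{m'} = \begin{cases}\dfrac{P_{m'}f-f}{\|P_{m'}f-f\|_n} & \text{if } \|P_{m'}f-f\|_n \ne 0,\\[2mm] 0 & \text{otherwise.}\end{cases}$$
Since $\hat f = P_{\hat m}f + P_{\hat m}\varepsilon$, (22) gives
$$\left\|f-\hat f\right\|_n^2 \le \left\|f-P_{m_0}f\right\|_n^2 + \widehat{\mathrm{pen}}(m_0) - \widehat{\mathrm{pen}}(\hat m) + 2\left\|f-P_{m_0}f\right\|_n\left|\langle u_{m_0},\varepsilon\rangle_n\right| + 2\left\|f-P_{\hat m}f\right\|_n\left|\langle u_{\hat m},\varepsilon\rangle_n\right| + 2G_{\hat m}^2. \tag{23}$$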

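Using repeatedly the following elementary inequality, which holds for all positive numbers $\nu, x, z$,
$$2xz \le \nu x^2 + \frac{1}{\nu}z^2, \tag{24}$$
we get for any $m' \in \mathcal{M}$
$$2\left\|f-P_{m'}f\right\|_n\left|\langle u_{m'},\varepsilon\rangle_n\right| \le \nu\left\|f-P_{m'}f\right\|_n^2 + \frac{1}{\nu}\left|\langle u_{m'},\varepsilon\rangle_n\right|^2. \tag{25}$$
By Pythagoras' theorem we have
$$\left\|f-\hat f\right\|_n^2 = \left\|f-P_{\hat m}f\right\|_n^2 + \left\|P_{\hat m}f-\hat f\right\|_n^2 = \left\|f-P_{\hat m}f\right\|_n^2 + G_{\hat m}^2. \tag{26}$$
We derive from (23) and (25) that for any $\nu > 0$
$$\left\|f-\hat f\right\|_n^2 \le \left\|f-P_{m_0}f\right\|_n^2 + \nu\left\|f-P_{m_0}f\right\|_n^2 + \frac{1}{\nu}\langle u_{m_0},\varepsilon\rangle_n^2 + \nu\left\|f-P_{\hat m}f\right\|_n^2 + \frac{1}{\nu}\langle u_{\hat m},\varepsilon\rangle_n^2 + 2G_{\hat m}^2 + \widehat{\mathrm{pen}}(m_0) - \widehat{\mathrm{pen}}(\hat m).$$
Now taking into account that by equation (26) $\left\|f-P_{\hat m}f\right\|_n^2 = \left\|f-\hat f\right\|_n^2 - G_{\hat m}^2$, the above inequality is equivalent to
$$(1-\nu)\left\|f-\hat f\right\|_n^2 \le (1+\nu)\left\|f-P_{m_0}f\right\|_n^2 + \frac{1}{\nu}\langle u_{m_0},\varepsilon\rangle_n^2 + \frac{1}{\nu}\langle u_{\hat m},\varepsilon\rangle_n^2 + (2-\nu)G_{\hat m}^2 + \widehat{\mathrm{pen}}(m_0) - \widehat{\mathrm{pen}}(\hat m). \tag{27}$$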



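We choose $\nu = \frac{2}{2+\eta} \in\, ]0,1[$, but for the sake of simplicity we keep using the notation $\nu$. Let $p_1$ and $p_2$ be two functions depending on $\nu$ mapping $\mathcal{M}$ into $\mathbb{R}_+$. They will be specified as in [3] to satisfy
$$\mathrm{pen}(m') \ge (2-\nu)\,p_1(m') + \frac{1}{\nu}\,p_2(m') \qquad \forall\, m' \in \mathcal{M}. \tag{28}$$
Remember that on $\Omega$, $\widehat{\mathrm{pen}}(m) \ge \mathrm{pen}(m)$ for all $m \in \mathcal{M}$. Since $\frac{1}{\nu}p_2(m') \le \mathrm{pen}(m')$ and $1+\nu \le 2$, we get from (27) and (28) that on the set $\Omega$
$$\begin{aligned}(1-\nu)\left\|f-\hat f\right\|_n^2 &\le (1+\nu)\left\|f-P_{m_0}f\right\|_n^2 + \widehat{\mathrm{pen}}(m_0) + \frac{1}{\nu}p_2(m_0) + (2-\nu)\left(G_{\hat m}^2 - p_1(\hat m)\right)\\ &\quad + \frac{1}{\nu}\left(\langle u_{\hat m},\varepsilon\rangle_n^2 - p_2(\hat m)\right) + \frac{1}{\nu}\left(\langle u_{m_0},\varepsilon\rangle_n^2 - p_2(m_0)\right)\\ &\le 2\left(\left\|f-P_{m_0}f\right\|_n^2 + \widehat{\mathrm{pen}}(m_0)\right) + (2-\nu)\left(G_{\hat m}^2 - p_1(\hat m)\right)\\ &\quad + \frac{1}{\nu}\left(\langle u_{\hat m},\varepsilon\rangle_n^2 - p_2(\hat m)\right) + \frac{1}{\nu}\left(\langle u_{m_0},\varepsilon\rangle_n^2 - p_2(m_0)\right).\end{aligned} \tag{29}$$
As $\frac{2}{1-\nu} = 2+\frac{4}{\eta}$, we obtain that
$$\begin{aligned}(1-\nu)H(f)\,1_\Omega &= \left((1-\nu)\left\|f-\hat f\right\|_n^2 - (1-\nu)\left(2+\frac{4}{\eta}\right)\left(\left\|f-P_{m_0}f\right\|_n^2 + \widehat{\mathrm{pen}}(m_0)\right)\right)_+1_\Omega\\ &= \left((1-\nu)\left\|f-\hat f\right\|_n^2 - 2\left(\left\|f-P_{m_0}f\right\|_n^2 + \widehat{\mathrm{pen}}(m_0)\right)\right)_+1_\Omega\\ &\le \left((2-\nu)\left(G_{\hat m}^2 - p_1(\hat m)\right) + \frac{1}{\nu}\left(\langle u_{\hat m},\varepsilon\rangle_n^2 - p_2(\hat m)\right) + \frac{1}{\nu}\left(\langle u_{m_0},\varepsilon\rangle_n^2 - p_2(m_0)\right)\right)_+1_\Omega.\end{aligned}$$
For any $x > 0$,
$$\begin{aligned}P\left((1-\nu)H(f)\,1_\Omega \ge \frac{x\,\delta_{\sup}^2}{n}\right) &\le \sum_{m'\in\mathcal{M}}P\left((2-\nu)\left(\|P_{m'}\varepsilon\|_n^2 - p_1(m')\right) \ge \frac{x\,\delta_{m'}^2}{3n}\right)\\ &\quad + \sum_{m'\in\mathcal{M}}P\left(\frac{1}{\nu}\left(\langle u_{m'},\varepsilon\rangle_n^2 - p_2(m')\right) \ge \frac{x\,\delta_{m'}^2}{3n}\right)\\ &:= \sum_{m'\in\mathcal{M}}P_{1,m'}(x) + \sum_{m'\in\mathcal{M}}P_{2,m'}(x).\end{aligned} \tag{30}$$
From now on, the proof of Lemma 5.1 is exactly the same as the end of the proof of Proposition 4.1 in [3] with $L_m = \eta$. $\square$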



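Proof of Proposition 4.2.

We first provide an upper bound for $E\left[\left\|f-\hat f\right\|_n^{2q}1_\Omega\right]$, where the set $\Omega$ depends on $\alpha$ chosen as in Lemma 5.1.

As $q \le 1$, we have $(a+b)^q \le a^q + b^q$. Together with Lemma 5.1 we deduce that
$$E\left[\left\|f-\hat f\right\|_n^{2q}1_\Omega\right] \le \Delta_\kappa^q\,\frac{\delta_{\sup}^{2q}}{n^q} + \kappa(\theta)^q\,E\left[\left(\left\|f-f_{m_0}\right\|_n^2 + \frac{D_{m_0}}{n}\hat\delta_{m_0}^2\right)^q\right].$$
Using the convexity of $x \mapsto x^{1/q}$ together with the Jensen inequality, we obtain
$$\left(E\left[\left\|f-\hat f\right\|_n^{2q}1_\Omega\right]\right)^{1/q} \le 2^{1/q-1}\Delta_\kappa\,\frac{\delta_{\sup}^2}{n} + 2^{1/q-1}\kappa(\theta)\,E\left[\left\|f-f_{m_0}\right\|_n^2 + \frac{D_{m_0}}{n}\hat\delta_{m_0}^2\right],$$
and by using the assumption A1 we have that
$$\left(E\left[\left\|f-\hat f\right\|_n^{2q}1_\Omega\right]\right)^{1/q} \le 2^{1/q-1}\Delta_\kappa\,\frac{\delta_{\sup}^2}{n} + 2^{1/q-1}\kappa(\theta)\left(\left\|f-f_{m_0}\right\|_n^2 + \frac{D_{m_0}}{n}\delta_{m_0}^2\right). \tag{31}$$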



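Now we need to find an upper bound for the quantity $E\left[\left\|f-\hat f\right\|_n^{2q}1_{\Omega^c}\right]$.
First, remark that
$$\left\|f-\hat f\right\|_n^2 = \left\|f-P_{\hat m}y\right\|_n^2 = \left\|f-P_{\hat m}f\right\|_n^2 + \left\|P_{\hat m}(f-y)\right\|_n^2 \le \left\|f-P_{\hat m}f\right\|_n^2 + \|\varepsilon\|_n^2 = \|f\|_n^2 - \left\|P_{\hat m}f\right\|_n^2 + \|\varepsilon\|_n^2.$$
And thus
$$\left\|f-\hat f\right\|_n^2 \le \|f\|_n^2 + \|\varepsilon\|_n^2.$$
So we have
$$E\left[\left\|f-\hat f\right\|_n^{2q}1_{\Omega^c}\right] \le \|f\|_n^{2q}\,P(\Omega^c) + E\left[\|\varepsilon\|_n^{2q}1_{\Omega^c}\right].$$
Using Hölder's inequality with $\frac{\kappa}{2q} > 1$ we obtain
$$E\left[\|\varepsilon\|_n^{2q}1_{\Omega^c}\right] \le E\left[\|\varepsilon\|_n^\kappa\right]^{\frac{2q}{\kappa}}P(\Omega^c)^{1-\frac{2q}{\kappa}}.$$
But
$$\|\varepsilon\|_n^2 = \frac{1}{n}\sum_{i=1}^n\|\varepsilon_i\|^2,$$
and as $\kappa \ge 2$, we can use Minkowski's inequality to obtain
$$E\left[\|\varepsilon\|_n^\kappa\right]^{\frac{2}{\kappa}} \le \frac{1}{n}\sum_{i=1}^n\left(E\left[\|\varepsilon_i\|^\kappa\right]\right)^{\frac{2}{\kappa}} = \left(E\left[\|\varepsilon_1\|^\kappa\right]\right)^{\frac{2}{\kappa}},$$
that is
$$E\left[\|\varepsilon\|_n^\kappa\right] \le E\left[\|\varepsilon_1\|^\kappa\right].$$
So we have, since $P(\Omega^c) \le P(\Omega^c)^{1-\frac{2q}{\kappa}}$,
$$E\left[\left\|f-\hat f\right\|_n^{2q}1_{\Omega^c}\right] \le \left(\|f\|_n^{2q} + E\left[\|\varepsilon_1\|^\kappa\right]^{\frac{2q}{\kappa}}\right)P(\Omega^c)^{1-\frac{2q}{\kappa}},$$
and with assumption A2
$$E\left[\left\|f-\hat f\right\|_n^{2q}1_{\Omega^c}\right] \le \left(\|f\|_n^{2q} + E\left[\|\varepsilon_1\|^\kappa\right]^{\frac{2q}{\kappa}}\right)\left(\frac{C(\alpha)}{n^\gamma}\right)^{1-\frac{2q}{\kappa}}.$$
As $\gamma \ge \frac{q}{1-2q/\kappa}$, we deduce that
$$E\left[\left\|f-\hat f\right\|_n^{2q}1_{\Omega^c}\right] \le \left(\|f\|_n^{2q} + E\left[\|\varepsilon_1\|^\kappa\right]^{\frac{2q}{\kappa}}\right)C(\alpha)^{1-\frac{2q}{\kappa}}\,\frac{1}{n^q}. \tag{32}$$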



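To conclude, we use again the convexity of $x \mapsto x^{1/q}$ together with the inequalities (31) and (32) to get
$$\left(E\left\|f-\hat f\right\|_n^{2q}\right)^{1/q} \le 4^{1/q-1}\left(\|f\|_n^2 + E\left[\|\varepsilon_1\|^\kappa\right]^{\frac{2}{\kappa}}\right)\frac{C(\alpha)^{\frac{1}{q}-\frac{2}{\kappa}}}{n} + 4^{1/q-1}\Delta_\kappa\,\frac{\delta_{\sup}^2}{n} + 4^{1/q-1}\kappa(\theta)\left(\left\|f-f_{m_0}\right\|_n^2 + \frac{D_{m_0}}{n}\delta_{m_0}^2\right). \qquad\square$$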
5.2 Proof of Theorem 3.1

Recall that $\beta > \max\left(2(1+2q),\,3+2q\right)$ and $\kappa \in \left]2(1+2q);\ \min(\beta,2\beta-4)\right[$. In order to use Proposition 4.2, we need to prove the following inequalities:

A1. $E\left[\hat\delta_m^2\right] \le \delta_m^2$.

A2. $P(\Omega^c) \le \dfrac{C(\alpha)}{n^\gamma}$ for $\gamma \ge \dfrac{q}{1-2q/\kappa}$.

First we prove A1.

Remember that $\hat\delta_m^2 = \frac{\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\hat\Phi\right)}{D_m}$. By using the linearity of the trace and the equality $E\left[\hat\Phi\right] = \frac{n-1}{n}\Phi$, we obtain that $E\left[\hat\delta_m^2\right] = \frac{n-1}{n}\delta_m^2$, which proves the result.
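The factor $\frac{n-1}{n}$ used for A1 can be seen from an exact algebraic identity: $\frac{1}{n}\sum_i y_iy_i^\top - S_{vec}S_{vec}^\top$ is the sample covariance with divisor $n$, hence $\frac{n-1}{n}$ times the usual unbiased sample covariance. The sketch below checks this identity numerically on synthetic vectors; the data and variable names are illustrative assumptions, not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 30, 4
Y = rng.standard_normal((n, k))                  # rows play the role of the y_i

s_vec = Y.mean(axis=0)
phi_hat = Y.T @ Y / n - np.outer(s_vec, s_vec)   # (1/n) sum y_i y_i^T - S_vec S_vec^T

unbiased = np.cov(Y, rowvar=False)               # divisor n - 1, unbiased for the covariance
ratio_ok = np.allclose(phi_hat, (n - 1) / n * unbiased)
```

Since the unbiased sample covariance has expectation $\Phi$, taking expectations on both sides of the identity gives the equality used in the proof.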



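For the second, write $\Omega^c \subset \bigcup_{m\in\mathcal{M}}\left\{\hat\delta_m^2 \le (1-\alpha)\delta_m^2\right\}$. We bound the quantity $P\left(\hat\delta_m^2 \le (1-\alpha)\delta_m^2\right)$ in the following proposition.

Proposition 5.2. For all $m \in \mathcal{M}$, $\alpha \in\, ]0;1[$ and $n \ge n(\kappa,\beta,\alpha,C_{\inf},\Sigma)$ we have, for some constants $C_1(\beta)$, $C_2(\beta)$:
$$P\left(\hat\delta_m^2 \le (1-\alpha)\delta_m^2\right) \le \frac{1}{n^{\kappa/4}}\left(C_2(\beta)2^{\beta+1} + \frac{C_1(\beta)}{\alpha^{\beta/2}}\right)\frac{E\left\|xx^\top\right\|^\beta}{\delta_m^\beta D_m^{\beta/2}}.$$

This proposition concludes the proof of A2 with
$$C(\alpha) = \left(C_2(\beta)2^{\beta+1} + \frac{C_1(\beta)}{\alpha^{\beta/2}}\right)E\left\|xx^\top\right\|^\beta\sum_{m\in\mathcal{M}}\delta_m^{-\beta}D_m^{-\beta/2}.$$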



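Proof of Proposition 5.2.

We start by dividing $\mathcal{P}_m = P\left(\hat\delta_m^2 \le (1-\alpha)\delta_m^2\right)$ into two parts, one of them involving a sum of independent variables with expectation equal to 0. Write $\mu = E[y_1] = \mathrm{vec}(\Sigma)$. Then
$$\begin{aligned}\mathcal{P}_m &= P\left(\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\hat\Phi\right) \le (1-\alpha)\,\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)\right)\\ &= P\left(\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\left(\hat\Phi - \left(\Phi+\mu\mu^\top\right) + \mu\mu^\top\right)\right) \le -\alpha\,\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)\right)\\ &\le P\left(\left|\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\left(\frac{1}{n}\sum_{i=1}^n\left(y_iy_i^\top - \Phi - \mu\mu^\top\right) + \mu\mu^\top - S_{vec}S_{vec}^\top\right)\right)\right| \ge \alpha\,\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)\right)\\ &\le P\left(\left|\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\,\frac{1}{n}\sum_{i=1}^n\left(y_iy_i^\top - \Phi - \mu\mu^\top\right)\right)\right| \ge \frac{\alpha}{2}\,\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)\right)\\ &\quad + P\left(\left|\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\left(\mu\mu^\top - S_{vec}S_{vec}^\top\right)\right)\right| \ge \frac{\alpha}{2}\,\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)\right).\end{aligned}$$
Set
$$Q_1 = P\left(\left|\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\,\frac{1}{n}\sum_{i=1}^n\left(y_iy_i^\top - \Phi - \mu\mu^\top\right)\right)\right| \ge \frac{\alpha}{2}\,\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)\right)$$
and
$$Q_2 = P\left(\left|\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\left(\mu\mu^\top - S_{vec}S_{vec}^\top\right)\right)\right| \ge \frac{\alpha}{2}\,\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)\right).$$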



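Study of $Q_1$

Write $Z_i = \mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\left(y_iy_i^\top - \Phi - \mu\mu^\top\right)\right)$. First we use Markov's inequality to obtain
$$Q_1 \le \frac{2^{\beta/2}\,E\left|\frac{1}{n}\sum_{i=1}^n Z_i\right|^{\beta/2}}{\left(\alpha\,\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)\right)^{\beta/2}}.$$
We must consider the two following cases:

• If $\frac{\beta}{2} \ge 2$, Rosenthal's inequality gives
$$E\left|\frac{1}{n}\sum_{i=1}^n Z_i\right|^{\beta/2} \le C\left(\frac{\beta}{2}\right)\left(\frac{1}{n^{\beta/2-1}}\,E|Z_1|^{\beta/2} + \frac{1}{n^{\beta/4}}\left(E|Z_1|^2\right)^{\beta/4}\right).$$
As $\frac{\beta}{2} \ge 2$, $\frac{1}{n^{\beta/2-1}} \le \frac{1}{n^{\beta/4}}$ and we can use Jensen's inequality on the second term to obtain
$$E\left|\frac{1}{n}\sum_{i=1}^n Z_i\right|^{\beta/2} \le \frac{2\,C\left(\frac{\beta}{2}\right)}{n^{\beta/4}}\,E|Z_1|^{\beta/2}.$$

• If $1 \le \frac{\beta}{2} \le 2$, we use Lemma 4.3 of subsection 4.2 to get
$$E\left|\frac{1}{n}\sum_{i=1}^n Z_i\right|^{\beta/2} \le \frac{8}{n^{\beta/2-1}}\,E|Z_1|^{\beta/2}.$$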
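In both cases, we can use the fact that $x \mapsto x^{\beta/2}$ is a convex and increasing function to obtain
$$E\left|\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\left(y_1y_1^\top - \Phi - \mu\mu^\top\right)\right)\right|^{\beta/2} \le 2^{\beta/2-1}\left(E\left|\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\,y_1y_1^\top\right)\right|^{\beta/2} + \left|\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\left(\Phi+\mu\mu^\top\right)\right)\right|^{\beta/2}\right).$$
Since $\Phi + \mu\mu^\top = E\left[y_1y_1^\top\right]$, using Jensen's inequality on the second term we have that
$$E\left|\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\left(y_1y_1^\top - \Phi - \mu\mu^\top\right)\right)\right|^{\beta/2} \le 2^{\beta/2}\,E\left|\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\,y_1y_1^\top\right)\right|^{\beta/2}.$$
Now consider the following lemma.

Lemma 5.3. If $\Psi$ is symmetric non-negative definite, then
$$\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Psi\right) \in \left[0;\ \mathrm{Tr}(\Psi)\right].$$

From this fact we get that
$$\left|\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\,y_1y_1^\top\right)\right|^{\beta/2} \le \mathrm{Tr}\left(y_1y_1^\top\right)^{\beta/2} = \|y_1\|^\beta = \left\|xx^\top\right\|^\beta. \tag{33}$$
In conclusion, we have
$$Q_1 \le C_1(\beta)\,\frac{E\left\|xx^\top\right\|^\beta}{\alpha^{\beta/2}\,\delta_m^\beta\,D_m^{\beta/2}\,n^\gamma}, \tag{34}$$
with $\gamma = \min\left(\frac{\beta}{4}, \frac{\beta}{2}-1\right)$, $C_1(\beta) = 2^{\beta+1}C\left(\frac{\beta}{2}\right)$ if $\beta \ge 4$, where $C\left(\frac{\beta}{2}\right)$ is the constant in Rosenthal's inequality, and $C_1(\beta) = 2^{\beta+3}$ if $2 \le \beta \le 4$; the factor $2^\beta$ collects the constant $2^{\beta/2}$ from Markov's inequality and the constant $2^{\beta/2}$ from the previous display. Remark that $\frac{\beta}{4} \ge \frac{\kappa}{4}$ and $\frac{\beta}{2}-1 \ge \frac{\kappa}{4}$, so $\gamma \ge \frac{\kappa}{4}$.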

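Study of $Q_2$

Recall that
$$Q_2 = P\left(\left|\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\left(\mu\mu^\top - S_{vec}S_{vec}^\top\right)\right)\right| \ge \frac{\alpha}{2}\,\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)\right).$$
Set
$$B_2 = \mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\left(\mu\mu^\top - S_{vec}S_{vec}^\top\right)\right).$$
Using the properties of the trace, we can write
$$B_2 = \mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\,\mu\mu^\top\right) - \mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\,S_{vec}S_{vec}^\top\right) = \mathrm{Tr}\left(\mu^\top(\Pi_m\otimes\Pi_m)\mu\right) - \mathrm{Tr}\left(S_{vec}^\top(\Pi_m\otimes\Pi_m)S_{vec}\right).$$
But $\Pi_m\otimes\Pi_m$ is an orthogonal projection matrix, so
$$\begin{aligned}B_2 &= \mathrm{Tr}\left(\mu^\top(\Pi_m\otimes\Pi_m)^\top(\Pi_m\otimes\Pi_m)\mu\right) - \mathrm{Tr}\left(S_{vec}^\top(\Pi_m\otimes\Pi_m)^\top(\Pi_m\otimes\Pi_m)S_{vec}\right)\\ &= \left\|(\Pi_m\otimes\Pi_m)\mu\right\|^2 - \left\|(\Pi_m\otimes\Pi_m)S_{vec}\right\|^2\\ &= \left(\left\|(\Pi_m\otimes\Pi_m)\mu\right\| - \left\|(\Pi_m\otimes\Pi_m)S_{vec}\right\|\right)\left(\left\|(\Pi_m\otimes\Pi_m)\mu\right\| + \left\|(\Pi_m\otimes\Pi_m)S_{vec}\right\|\right).\end{aligned}$$
Hence
$$\begin{aligned}|B_2| &\le \left\|(\Pi_m\otimes\Pi_m)(\mu-S_{vec})\right\|\left(\left\|(\Pi_m\otimes\Pi_m)\mu\right\| + \left\|(\Pi_m\otimes\Pi_m)S_{vec}\right\|\right)\\ &\le \left\|(\Pi_m\otimes\Pi_m)(\mu-S_{vec})\right\|^2 + 2\left\|(\Pi_m\otimes\Pi_m)(\mu-S_{vec})\right\|\left\|(\Pi_m\otimes\Pi_m)\mu\right\|\\ &\le \left\|(\Pi_m\otimes\Pi_m)(\mu-S_{vec})\right\|^2 + 2\left\|(\Pi_m\otimes\Pi_m)(\mu-S_{vec})\right\|\|\mu\|.\end{aligned}$$
Finally
$$Q_2 \le P\left(\left\|(\Pi_m\otimes\Pi_m)(\mu-S_{vec})\right\|^2 \ge \frac{\alpha}{4}\,\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)\right) + P\left(\left\|(\Pi_m\otimes\Pi_m)(\mu-S_{vec})\right\| \ge \frac{\alpha}{8\|\mu\|}\,\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)\right).$$
Now we need to provide an upper bound for the quantities
$$P\left(\left\|(\Pi_m\otimes\Pi_m)(\mu-S_{vec})\right\|^2 \ge t\right). \tag{35}$$
For this we will use the deviation bound provided by Proposition 4.4 stated in subsection 4.2.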


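Set
$$G_n = \frac{1}{n}\begin{pmatrix}\mathrm{Id}_{p^2} & \cdots & \mathrm{Id}_{p^2}\\ \vdots & & \vdots\\ \mathrm{Id}_{p^2} & \cdots & \mathrm{Id}_{p^2}\end{pmatrix} \in \mathbb{R}^{np^2\times np^2}.$$
Then
$$G_n(y-f) = 1_n\otimes\left(S_{vec}-\mu\right),$$
where $1_n$ denotes the vector of $\mathbb{R}^n$ with all entries equal to 1. Now, if
$$H_m = \mathrm{Id}_n\otimes(\Pi_m\otimes\Pi_m) = \begin{pmatrix}\Pi_m\otimes\Pi_m & & 0\\ & \ddots & \\ 0 & & \Pi_m\otimes\Pi_m\end{pmatrix},$$
we have
$$H_m\left(1_n\otimes\left(S_{vec}-\mu\right)\right) = 1_n\otimes\left((\Pi_m\otimes\Pi_m)\left(S_{vec}-\mu\right)\right).$$
In conclusion, with
$$A_m = H_mG_n = \frac{1}{n}\begin{pmatrix}\Pi_m\otimes\Pi_m & \cdots & \Pi_m\otimes\Pi_m\\ \vdots & & \vdots\\ \Pi_m\otimes\Pi_m & \cdots & \Pi_m\otimes\Pi_m\end{pmatrix},$$
we have that
$$A_m(y-f) = 1_n\otimes\left((\Pi_m\otimes\Pi_m)\left(S_{vec}-\mu\right)\right).$$
Moreover, $A_m$ is an orthogonal projection matrix and we have the following equalities:
$$\left\|A_m(y-f)\right\|^2 = n\left\|(\Pi_m\otimes\Pi_m)\left(S_{vec}-\mu\right)\right\|^2 = (y-f)^\top A_m(y-f),$$
$$\mathrm{Tr}(A_m) = \frac{n}{n}\,\mathrm{Tr}\left(\Pi_m\otimes\Pi_m\right) = D_m,$$
$$\mathrm{Tr}\left(A_m\left(\mathrm{Id}_n\otimes\Phi\right)\right) = \frac{n}{n}\,\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right) = \mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right).$$
Now we can use Proposition 4.4 with $A = A_m$, $\varepsilon_i = y_i - f_i$, $\mathrm{Tr}(A_m) = D_m$, $\rho(A_m) = 1$, $\delta^2 = \delta_m^2$ and $\beta \ge 2$.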
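The three properties of $A_m$ used above (it is an orthogonal projection, its trace equals $D_m$, and its quadratic form equals $n$ times the squared norm of the projected centred mean) can be checked numerically on a small example. The sketch below is illustrative only; the dimensions, the random design and the stand-in vectors for $y_i$ and $\mu$ are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, k = 5, 3, 2
G = rng.standard_normal((p, k))                 # hypothetical design matrix G_m
Pi = G @ np.linalg.pinv(G.T @ G) @ G.T          # projector Pi_m
PP = np.kron(Pi, Pi)                            # Pi_m kron Pi_m, size p^2 x p^2
A_m = np.kron(np.ones((n, n)) / n, PP)          # n x n blocks, each equal to PP / n

Y = rng.standard_normal((n, p * p))             # rows stand in for the y_i
mu = rng.standard_normal(p * p)                 # stands in for mu = vec(Sigma)
resid = (Y - mu).reshape(-1)                    # stacked vector y - f
quad = resid @ A_m @ resid                      # (y - f)^T A_m (y - f)
direct = n * np.sum((PP @ (Y.mean(axis=0) - mu)) ** 2)
```

With a full-rank design of $k$ columns, the trace of `A_m` equals $k^2$, which is the value of $D_m$ in this toy setting.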

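This gives for all $x > 0$
$$P\left((y-f)^\top A_m(y-f) \ge \mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)\left(1+\sqrt{\frac{x}{D_m}}\right)^2\right) \le C_2(\beta)\,\frac{E\left[\|y_1-\mu\|^\beta\right]D_m^{\beta/2+1}}{\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)^{\beta/2}x^{\beta/2}},$$
that is
$$P\left(\left\|(\Pi_m\otimes\Pi_m)\left(S_{vec}-\mu\right)\right\|^2 \ge \frac{1}{n}\,\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)\left(1+\sqrt{\frac{x}{D_m}}\right)^2\right) \le C_2(\beta)\,\frac{E\left[\|y_1-\mu\|^\beta\right]D_m^{\beta/2+1}}{\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)^{\beta/2}x^{\beta/2}}.$$
In order to use this deviation bound to obtain the inequalities
$$P\left(\left\|(\Pi_m\otimes\Pi_m)(\mu-S_{vec})\right\|^2 \ge \frac{\alpha}{4}\,\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)\right) \le \frac{c}{\delta_m^\beta D_m^{\beta/2}n^\gamma}$$
and
$$P\left(\left\|(\Pi_m\otimes\Pi_m)(\mu-S_{vec})\right\| \ge \frac{\alpha}{8\|\mu\|}\,\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)\right) \le \frac{c}{\delta_m^\beta D_m^{\beta/2}n^\gamma},$$
with $\gamma \ge \frac{q}{1-2q/\kappa}$, we need to find $x > 0$ satisfying the three following facts:
$$\forall m\in\mathcal{M}, \qquad \frac{\alpha}{4} \ge \frac{1}{n}\left(1+\sqrt{\frac{x}{D_m}}\right)^2, \tag{36}$$
$$\forall m\in\mathcal{M}, \qquad \left(\frac{\alpha}{8\|\mu\|}\right)^2\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right) \ge \left(\frac{\alpha}{8\|\mu\|}\right)^2C_{\inf} \ge \frac{1}{n}\left(1+\sqrt{\frac{x}{D_m}}\right)^2, \tag{37}$$
$$\frac{D_m^{\beta/2+1}}{\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)^{\beta/2}x^{\beta/2}} \le \frac{1}{\delta_m^\beta D_m^{\beta/2}n^\gamma}. \tag{38}$$
(36) and (37) hold for the choice $x = D_mn^r$ with $r < 1$ and if $n$ is large enough to have
$$\frac{1}{n}\left(1+\sqrt{n^r}\right)^2 \le \frac{\alpha}{4} \tag{39}$$
and
$$\frac{1}{n}\left(1+\sqrt{n^r}\right)^2 \le \left(\frac{\alpha}{8\|\mu\|}\right)^2C_{\inf}. \tag{40}$$
In order to obtain (38) with $x = D_mn^r$, we use the inequality $D_m \le n$, which gives, since $\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right) = \delta_m^2D_m$,
$$\frac{D_m^{\beta/2+1}}{\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Phi\right)^{\beta/2}x^{\beta/2}} \le \frac{1}{\delta_m^\beta D_m^{\beta/2}n^{r\beta/2-1}}.$$
Moreover
$$E\left[\|y_1-\mu\|^\beta\right] \le E\left[\left(\|y_1\|+\|\mu\|\right)^\beta\right],$$
and by using properties of convexity we obtain
$$E\left[\|y_1-\mu\|^\beta\right] \le 2^{\beta-1}\left(E\left[\|y_1\|^\beta\right] + \|\mu\|^\beta\right).$$
With Jensen's inequality we get:
$$E\left[\|y_1-\mu\|^\beta\right] \le 2^\beta\,E\left[\|y_1\|^\beta\right].$$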



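In conclusion, with $r = \frac{\kappa+4}{2\beta} < 1$, so that $\frac{r\beta}{2}-1 = \frac{\kappa}{4}$, we obtain for $n \ge n(\kappa,\beta,\alpha,C_{\inf},\Sigma)$
$$Q_2 \le 2^{\beta+1}C_2(\beta)\,\frac{E\left\|xx^\top\right\|^\beta}{\delta_m^\beta D_m^{\beta/2}n^{\kappa/4}}, \tag{41}$$
where $C_2(\beta)$ is the constant which appears in Proposition 4.4.

In conclusion, combining (34) and (41),
$$P\left(\hat\delta_m^2 \le (1-\alpha)\delta_m^2\right) \le \frac{1}{n^{\kappa/4}}\left[C_2(\beta)2^{\beta+1} + \frac{C_1(\beta)}{\alpha^{\beta/2}}\right]\frac{E\left\|xx^\top\right\|^\beta}{\delta_m^\beta D_m^{\beta/2}}$$
for $n \ge n(\kappa,\beta,\alpha,C_{\inf},\Sigma)$.

To conclude, remark that $\frac{\kappa}{4}\left(1-\frac{2q}{\kappa}\right) = \frac{\kappa-2q}{4} > \frac{2+2q}{4} \ge q$ as $q \le 1$, so that A2 holds with $\gamma = \frac{\kappa}{4}$. $\square$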



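Proof of Lemma 5.3.

Recall that $\Pi_m\otimes\Pi_m$ is an orthogonal projection matrix. Hence there exists an orthogonal matrix $P_m$ such that $P_m^\top(\Pi_m\otimes\Pi_m)P_m = D$, with $D$ a diagonal matrix with $D_{ii} = 1$ if $i \le D_m$, and $D_{ii} = 0$ otherwise. Then if $\Psi$ is symmetric non-negative definite we have:
$$\mathrm{Tr}\left((\Pi_m\otimes\Pi_m)\Psi\right) = \mathrm{Tr}\left(DP_m^\top\Psi P_m\right) = \sum_{l=1}^{p^2}\sum_{k=1}^{p^2}D_{lk}\left(P_m^\top\Psi P_m\right)_{kl} = \sum_{l=1}^{p^2}D_{ll}\left(P_m^\top\Psi P_m\right)_{ll} = \sum_{l=1}^{D_m}\left(P_m^\top\Psi P_m\right)_{ll} \in \left[0;\ \mathrm{Tr}(\Psi)\right].$$
Indeed, $P_m^\top\Psi P_m$ is non-negative definite, so all its diagonal entries are non-negative, and $\mathrm{Tr}\left(P_m^\top\Psi P_m\right) = \mathrm{Tr}(\Psi)$. $\square$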
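The bound of Lemma 5.3 is easy to check numerically. The following sketch builds a random projector and a random symmetric non-negative definite matrix and compares the two traces; all inputs are synthetic and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
p, k = 4, 2
G = rng.standard_normal((p, k))                 # hypothetical design matrix
Pi = G @ np.linalg.pinv(G.T @ G) @ G.T          # orthogonal projector of rank k
P = np.kron(Pi, Pi)                             # projector of rank k^2 on R^{p^2}

B = rng.standard_normal((p * p, p * p))
Psi = B @ B.T                                   # symmetric non-negative definite

value = np.trace(P @ Psi)                       # Tr((Pi kron Pi) Psi)
upper = np.trace(Psi)                           # Tr(Psi)
```

The check relies only on `P` being an orthogonal projection, which is the single property of $\Pi_m\otimes\Pi_m$ used in the proof.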